Performance Evaluation of Analytical Queries on a Standalone and Sharded Document Store

نویسندگان

  • Aarthi Raghavendra
  • Karen C. Davis
چکیده

Numerous organizations perform data analytics using relational databases by executing data mining queries. These queries include complex joins and aggregate functions. However, due to an explosion of data in terms of volume, variety, veracity, and velocity known as Big Data [1], many organizations such as Foursquare, Adobe, and Bosch have migrated to NoSQL databases [2] such as MongoDB [3] and Cassandra [4]. We investigate the performance impact of analytical queries on a NoSQL document store. We benchmark the performance of MongoDB [3], a cross-platform document-oriented database, in a stand-alone environment and a sharded environment. The TPC-DS benchmark [5] is used to generate data of different scales and selected data mining queries are executed in both the environments. Our experimental results show that along with choosing the environment, data modeling in MongoDB also has a significant impact on query execution times. Analytical query performance is best when data is stored in a denormalized fashion. When the data is sharded, due to multiple query predicates in an analytical query, aggregating data from a few or all nodes proves to be an expensive process and hence performs poorly when compared to the alternative process of executing the same in a stand-alone environment.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...

متن کامل

Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...

متن کامل

ChronicleDB: A High-Performance Event Store

Reactive security monitoring, self-driving cars, the Internet of Things (IoT) and many other novel applications require systems for both writing events arriving at very high and fluctuating rates to persistent storage as well as supporting analytical ad-hoc queries. As standard database systems are not capable to deliver the required write performance, logbased systems, key-value stores and oth...

متن کامل

Investigating the Impact of Authors’ Rank in Bibliographic Networks on Expertise Retrieval

Background and Aim: this research investigates the impact of authors’ rank in Bibliographic networks on document-centered model of Expertise Retrieval. Its purpose is to find out what kind of authors’ ranking in bibliographic networks can improve the performance of document-centered model.   Methodology: Current research is an experimental one. To operationalize research goals, a new test colle...

متن کامل

مدل جدیدی برای جستجوی عبارت بر اساس کمینه جابه‌جایی وزن‌دار

Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017